Introduction: Thrombosis remains a leading cause of morbidity and early mortality in polycythemia vera (PV). The European LeukemiaNet (ELN) classifies patients (pts) at diagnosis as high-risk based on age ≥60 years (yr) and/or prior thrombosis, but dynamic models predicting the short-term risk of initial or recurrent thrombosis are unavailable. We used machine learning (ML) to develop a dynamic scoring system that predicts thrombosis in PV pts using the most important of 60 clinicopathologic features.

Methods: A Random Forests ML model was trained to classify instances (3-month follow-up intervals of PV pts) as predictive or non-predictive of thrombosis, arterial or venous, in the subsequent 3-6 months based on 60 features: 4 demographic, 11 history & physical, 13 treatment, 18 laboratory, and 14 pathology and molecular. The dataset was derived from the Weill Cornell Medicine (WCM) Research Database Repository as previously described (Abu-Zeinah et al. Leukemia 2021) and split into training (75%) and testing (25%) sets. Hyperparameter tuning was performed to optimize model training. The synthetic minority oversampling technique (SMOTE) was applied to reduce class imbalance, since instances predictive of thrombosis were a minority. Missing data were imputed using multiple imputation by chained equations (MICE). The scoring system was developed from the ML-derived features of highest importance and confirmed by multivariable logistic regression analysis (MVA). Cumulative incidence (CI) of thrombosis was compared between risk groups using the Fine-Gray model. External validation of the ML model and scoring system is underway using the Mount Sinai School of Medicine (MSSM) PV dataset. Statistical analyses were performed and plots generated in RStudio v1.4.1106.
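The oversampling step named in the Methods can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic minority instance is placed at a random point on the segment between a minority instance and one of its nearest minority neighbours. This is not the authors' implementation (the analyses were run in R, and the package used is not stated); the function name `smote_sketch`, its parameters, and the toy data are hypothetical and chosen only for illustration.

```python
import random
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote_sketch(minority: List[Vector], n_new: int, k: int = 2,
                 seed: int = 0) -> List[List[float]]:
    """Generate n_new synthetic minority samples by interpolation.

    Illustrative only: numeric features, no scaling, brute-force
    neighbour search. Hypothetical helper, not a library API.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority instance as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbours (excluding itself).
        others = [m for j, m in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda m: squared_distance(base, m))[:k]
        neighbour = rng.choice(neighbours)
        # Interpolate: base + gap * (neighbour - base), gap in [0, 1).
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two features (hypothetical values).
toy_minority = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
toy_synthetic = smote_sketch(toy_minority, n_new=5)
```

Because every synthetic point is a convex combination of two real minority instances, the new samples stay inside the region already occupied by the minority class rather than being exact duplicates.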

Results: A total of 470 PV pts at WCM were included; baseline features are shown in Fig 1A. During follow-up, 159 thromboses (88 venous, 71 arterial) occurred in 115 pts. CI of thrombosis was significantly higher shortly after diagnosis, as previously reported (Hultcrantz et al. Ann Intern Med. 2018), and following a thrombotic event (Fig 1B-C). Bilinear fitting to CI curves identified a 2-yr breakpoint that marked the transition from a high to a much lower long-term risk after diagnosis (incidence rate (IR), per year, of 4.4% vs 1%, respectively) and after thrombosis (IR of 9.7% vs 1.8%). Of the ML model's top 10 features, the 5 that independently predicted thrombosis in MVA were selected for a clinically convenient scoring system to estimate thrombosis risk (Fig 1D-E). One point was assigned for each of age ≥60 yr, prior thrombosis, WBC ≥12 × 10⁹/L, peri-diagnosis (<2 yr from diagnosis), and peri-thrombosis (<2 yr from last thrombosis). Using this scoring system, high-risk (Hi) and intermediate-risk (Int) pts (score ≥2 and score = 1, respectively) were 6.5 and 2.3 times more likely to have thrombosis than low-risk (Lo) pts (score = 0) (p<0.001 and p=0.014, respectively). Probability of thrombosis was significantly different for Lo, Int, and Hi at 1 yr (0%, 1%, and 6%), 2 yr (1%, 3%, and 10%), and 5 yr (2%, 9%, and 21%) (Fig 1F & 1H). In contrast, ELN high-risk pts were only 2.2 times more likely to have thrombosis than ELN low-risk pts (Fig 1G-H). The concordance (C-index) of the ML-derived model (0.7 ± SE 0.02) was higher than that of ELN (0.59 ± SE 0.03). External validation using the MSSM PV data (Fig 1A) is ongoing.
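The scoring rule described above (one point per criterion; score 0 = Lo, 1 = Int, ≥2 = Hi) can be written as a short sketch. The class and function names below are hypothetical, and the sketch assumes that prior thrombosis is represented by the time since the most recent event (`None` meaning no prior thrombosis); the abstract does not specify how these inputs were encoded.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatientSnapshot:
    """State of one patient at one time point (hypothetical structure)."""
    age_years: float
    years_since_diagnosis: float
    wbc_e9_per_l: float  # white blood cell count, x 10^9/L
    # Assumption: None means the patient has never had a thrombosis.
    years_since_last_thrombosis: Optional[float] = None


def thrombosis_score(p: PatientSnapshot) -> int:
    """One point for each criterion listed in the Results."""
    prior = p.years_since_last_thrombosis is not None
    criteria = [
        p.age_years >= 60,                             # age >= 60 yr
        prior,                                         # prior thrombosis
        p.wbc_e9_per_l >= 12,                          # WBC >= 12 x 10^9/L
        p.years_since_diagnosis < 2,                   # peri-diagnosis
        prior and p.years_since_last_thrombosis < 2,   # peri-thrombosis
    ]
    return sum(criteria)


def risk_group(score: int) -> str:
    """Map a score to the Lo / Int / Hi groups used in the abstract."""
    if score == 0:
        return "Lo"
    if score == 1:
        return "Int"
    return "Hi"


# The same hypothetical patient evaluated at two time points.
at_diagnosis = PatientSnapshot(age_years=45, years_since_diagnosis=0.5,
                               wbc_e9_per_l=8)
five_years_on = PatientSnapshot(age_years=50, years_since_diagnosis=5.5,
                                wbc_e9_per_l=8)
```

Re-evaluating the same patient over time is what makes the score dynamic: in this illustration the patient is Int shortly after diagnosis (peri-diagnosis point only) and Lo once more than 2 yr have passed without a thrombosis.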

Discussion: We applied ML to our large PV-WCM dataset to identify the most important clinicopathologic features predicting thrombosis. In contrast to linear models, ML incurs little penalty for increasing the number of parameters tested and can easily accommodate high-dimensional data to improve predictions. Because "big data" is not routinely available to caregivers, we developed a simple, dynamic scoring system that predicts the risk of thrombosis in PV based on 5 of the most important features identified by ML. This new, dynamic scoring system outperformed ELN stratification and may prove useful in guiding treatment and improving the selection of pts for clinical trials aimed at preventing thrombosis in PV.

Conclusion: The risk of thrombosis in PV pts is temporally non-linear and strongly influenced by proximity to diagnosis and recent thrombosis. A simple ML-derived dynamic scoring system is presented that better classifies pts into distinct Lo, Int, and Hi thrombosis risk groups based on age, prior thrombosis, WBC, peri-diagnosis, and peri-thrombosis.

Disclosures

Abu-Zeinah: PharmaEssentia: Consultancy. Silver: Abbvie: Consultancy; PharmaEssentia: Consultancy, Speakers Bureau. Mascarenhas: Merck: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; CTI Biopharm: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; Galecto: Consultancy; Geron: Consultancy; PharmaEssentia: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; Genentech/Roche: Consultancy, Membership on an entity's Board of Directors or advisory committees; Sierra Oncology: Consultancy, Membership on an entity's Board of Directors or advisory committees; Prelude: Consultancy; Celgene/BMS: Consultancy, Membership on an entity's Board of Directors or advisory committees; Merus: Research Funding; AbbVie: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; Promedior: Consultancy, Membership on an entity's Board of Directors or advisory committees; Roche: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; Incyte: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; Kartos: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; Constellation: Consultancy, Membership on an entity's Board of Directors or advisory committees; Gilead: Consultancy, Membership on an entity's Board of Directors or advisory committees; Forbius: Research Funding; Geron: Consultancy, Research Funding; Novartis: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding.
Scandura: MPN-RF (Foundation): Research Funding; CR&T (Foundation): Research Funding; European LeukemiaNet: Honoraria, Other: travel fees; Abbvie: Consultancy, Honoraria, Membership on an entity's Board of Directors or advisory committees, Research Funding; Constellation: Research Funding.

